1 INTRODUCTION 


The Theory Of Linear and Non Linear Regression Analysis 
involves modeling of data points (of the Independent Variables 
and the Dependent Variable) for finding any Trends, both of 
Linear and Non Linear kind in them. Some of these concepts are 
detailed in the works mentioned in the References section. 


In this research manuscript, the author presents a 
comprehensive Holistic Theoretical Model For Optimal Multiple 
Linear and Multiple Non Linear Regression Analysis. Also, an 
Exhaustive Error Modeling Scheme is detailed to refine the 
presented Model. 


HOLISTIC THEORETICAL MODEL FOR OPTIMAL MULTIPLE LINEAR AND 
MULTIPLE NON LINEAR REGRESSION ANALYSIS 


2 OPTIMAL MULTIPLE LINEAR REGRESSION 


First, we consider Linear Regression Analysis with one 
independent variable. Let the independent variable be denoted 
by $x_i$ and the dependent variable be denoted by $y_i$. The number 
of data points $(x_i, y_i)$ considered is $n$. A linear 
relationship between $x_k$ and $y_k$ for any pair $(x_k, y_k)$ with 
$1 \le k \le n$ can be written as 

$y_k = m x_k + c_k$    Equation 1 

Therefore, we can generalize the above for all $n$ pairs of points as 
$y_i = m x_i + c_i$    Equation 2 

where $i = 1$ to $n$, $m$ represents the Slope of the Straight Line 
and $c_i$ represents the y-intercept. 


Now, we can apply the Summation Operator on the above 
equation (over all $n$ points), giving us 

$\sum_{i=1}^{n} y_i = \sum_{i=1}^{n} m x_i + nc + \sum_{i=1}^{n} \varepsilon_i$    Equation 3a 

where $c$ is a constant. 
It can be noted that in Equation 3a, we have assumed that 

$\sum_{i=1}^{n} c_i = nc + \sum_{i=1}^{n} \varepsilon_i$    Equation 3b 

where the $\varepsilon_i$ are the errors in the y-intercept value of the 
aforementioned linear approximation, when the y-intercept 
value is supposed to be $c$ for all the data points $(x_i, y_i)$ when we 
linearly relate them with a Straight Line with Slope $m$. 
We now divide the entire Equation 3a by $n$, giving us 

$\frac{1}{n}\sum_{i=1}^{n} y_i = \frac{1}{n}\sum_{i=1}^{n} m x_i + c + \frac{1}{n}\sum_{i=1}^{n} \varepsilon_i$    Equation 4 
which can be written as 
$\bar{y} = m\bar{x} + c + \bar{\varepsilon}$    Equation 5 
where $\bar{y}$ and $\bar{x}$ can be computed as they are simple Arithmetic 
Mean values. 
If we ignore $\bar{\varepsilon}$, which is the Arithmetic Mean of the 
aforementioned errors, for our analysis, we can note that 
$\bar{y} = m\bar{x} + c$    Equation 6 

could best represent the Average Linear Relationship between 
the data points $x_i$ and $y_i$, as $\bar{x}$ and $\bar{y}$ are the Centroids of the 
$x_i$ and $y_i$ respectively. Therefore, we are left with the task of 
evaluating $m$ and $c$ for such an aforementioned Linear 
Relationship. 


In order to solve for $m$ and $c$, we can note that we can simply 
use appropriate Scalar Variable Multipliers on Equation 2 and 
apply the Summation Operator through all the data points to 
generate another equation. That is, if we multiply the entire 
Equation 2 by $x_i$ and apply the Summation Operator over 
all $n$ points, we have 

$\sum_{i=1}^{n} x_i y_i = \sum_{i=1}^{n} m x_i^2 + c\sum_{i=1}^{n} x_i + \sum_{i=1}^{n} \varepsilon_i x_i$    Equation 7 

which upon division by $n$ becomes 

$\frac{1}{n}\sum_{i=1}^{n} x_i y_i = \frac{1}{n}\sum_{i=1}^{n} m x_i^2 + \frac{c}{n}\sum_{i=1}^{n} x_i + \frac{1}{n}\sum_{i=1}^{n} \varepsilon_i x_i$    Equation 8 

Neglecting the last term $\frac{1}{n}\sum_{i=1}^{n} \varepsilon_i x_i$ for the convenience of our 
analysis as already discussed before, we have 

$\frac{1}{n}\sum_{i=1}^{n} x_i y_i = \frac{1}{n}\sum_{i=1}^{n} m x_i^2 + \frac{c}{n}\sum_{i=1}^{n} x_i$    Equation 9 

which can be written as 

$\overline{xy} = m\overline{x^2} + c\bar{x}$    Equation 10 

Now, Equation 6 and Equation 10 can be solved using the 
theory of Simultaneous Linear Equations for the values of m 
and c. 
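
As a minimal sketch of this step, assuming NumPy and a hypothetical helper name `fit_line_from_moments` (neither appears in the original text), Equation 6 and Equation 10 can be written as a 2x2 linear system in m and c and solved directly. The data in the usage lines is made up for illustration.

```python
import numpy as np

def fit_line_from_moments(x, y):
    """Solve Equation 6 and Equation 10 simultaneously for m and c.

    Equation 6:   mean(y)   = m * mean(x)    + c
    Equation 10:  mean(x*y) = m * mean(x**2) + c * mean(x)
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # One row per equation; the columns multiply (m, c).
    A = np.array([[x.mean(), 1.0],
                  [(x * x).mean(), x.mean()]])
    b = np.array([y.mean(), (x * y).mean()])
    m, c = np.linalg.solve(A, b)
    return m, c

# Illustrative data lying exactly on y = 2x + 1.
m, c = fit_line_from_moments([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
print(m, c)
```

With the multiplier x, this pair of equations coincides with the normal equations of ordinary least squares for a straight line, so the result should match a standard least-squares fit.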

However, we can note that there are an infinite number of ways in 
which a Scalar Variable Multiplier can be applied to 
Equation 2, followed by the Summation Operator over all the n 
data points, to generate another Equation. 
In this research, the author limits the analysis to only a 
certain range of such Multipliers so as to design the Optimal 
outcome of the m and c values. We can note that we can also 
multiply Equation 2 entirely by $y_i$, the Dependent variable, to 
generate another equation akin to Equation 10 to 
solve for the values of m and c using Equation 6. The 
restrictions for such Scalar Variable Multipliers are detailed as 
follows: 


Restrictions: 

1. We have to use all the Variables as such Scalar Variable 
Multipliers at least once. 

2. The highest degree of any of the terms on both sides of 
equation gotten by such aforementioned Scalar Variable 
Multipliers is equal to the sum of the number of 
Independent Variables and the number of Dependent 
Variables. 

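For example, if we have one Dependent Variable $y_1$ and two 
Independent Variables $x_1$ and $x_2$, we have relations of the following kind 
after performing such aforementioned Scalar Variable Multiplier 
product operations: 

$\bar{y}_1 = m_1\bar{x}_1 + m_2\bar{x}_2 + c$    Equation 11 

$\overline{y_1^2} = m_1\overline{x_1 y_1} + m_2\overline{x_2 y_1} + c\,\bar{y}_1$    Equation 12 

$\overline{y_1 x_1} = m_1\overline{x_1^2} + m_2\overline{x_1 x_2} + c\,\bar{x}_1$    Equation 13 

$\overline{y_1 x_2} = m_1\overline{x_1 x_2} + m_2\overline{x_2^2} + c\,\bar{x}_2$    Equation 14 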

The degree of any of the terms on either side of 
Equations 12, 13 and 14 is only 2, while the allowed level is 3, 
being the sum of the number of Independent and Dependent 
Variables. 

Therefore, we can afford to multiply each of the Equations 12, 
13 and 14 once again by each of the Scalar Multiplier Variables 
$x_1$, $x_2$ and $y_1$ to give us 3+3+3=9 Equations. They are: 





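$\overline{y_1^3} = m_1\overline{x_1 y_1^2} + m_2\overline{x_2 y_1^2} + c\,\overline{y_1^2}$    Equation 15 

$\overline{y_1^2 x_1} = m_1\overline{x_1^2 y_1} + m_2\overline{x_1 x_2 y_1} + c\,\overline{x_1 y_1}$    Equation 16 

$\overline{y_1^2 x_2} = m_1\overline{x_1 x_2 y_1} + m_2\overline{x_2^2 y_1} + c\,\overline{y_1 x_2}$    Equation 17 

$\overline{y_1^2 x_1} = m_1\overline{x_1^2 y_1} + m_2\overline{x_2 y_1 x_1} + c\,\overline{y_1 x_1}$    Equation 18 

$\overline{y_1 x_1^2} = m_1\overline{x_1^3} + m_2\overline{x_1^2 x_2} + c\,\overline{x_1^2}$    Equation 19 

$\overline{y_1 x_1 x_2} = m_1\overline{x_1^2 x_2} + m_2\overline{x_1 x_2^2} + c\,\overline{x_1 x_2}$    Equation 20 

And 

$\overline{y_1^2 x_2} = m_1\overline{x_1 x_2 y_1} + m_2\overline{x_2^2 y_1} + c\,\overline{y_1 x_2}$    Equation 21 

$\overline{x_2 y_1 x_1} = m_1\overline{x_1^2 x_2} + m_2\overline{x_1 x_2^2} + c\,\overline{x_1 x_2}$    Equation 22 

$\overline{y_1 x_2^2} = m_1\overline{x_1 x_2^2} + m_2\overline{x_2^3} + c\,\overline{x_2^2}$    Equation 23 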


Therefore, we now have 3 Unknowns, namely, $m_1$, $m_2$ and $c$, 
and 13 Linear Equations: 

1 → Equation 11 

3 → Equations 12, 13 and 14 

9 → Equations 15, 16, 17, 18, 19, 20, 21, 22, 23 
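
The tally above can be checked with a short enumeration. This is only an illustrative sketch: the labels `y1`, `x1`, `x2` are placeholders, and the second figure it prints (the number of distinct multiplier products, since for example y1*x1 and x1*y1 are the same multiplier) is an added observation rather than something stated in the text.

```python
from itertools import product

variables = ("y1", "x1", "x2")

# Single multipliers correspond to Equations 12, 13 and 14.
first_level = [(v,) for v in variables]
# Multiplying each of those once more corresponds to Equations 15 to 23.
second_level = list(product(variables, repeat=2))

# Count including the Basic Equation 11.
total_equations = 1 + len(first_level) + len(second_level)

# Sorting the factors identifies products that differ only in order.
distinct_second = {tuple(sorted(p)) for p in second_level}
distinct_equations = 1 + len(first_level) + len(distinct_second)

print(total_equations, distinct_equations)
```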

However, we have to note that the Best 3 Equations 
Combination among these 13 Equations gives us the best values 
of $m_1$, $m_2$ and $c$. 

Also, we have to note that since Equation 11 is the Basic 
Equation from which the other 12 Equations are 
built, we ought to keep this one for our use as 1 among the 3 
Equations to be used for computing the best values of $m_1$, $m_2$ 
and $c$. 

Therefore, we are reduced to the situation wherein we have to 
select 2 Equations among the Equations numbered 12 
through 23, using all such Combinations of 2 Equations (there 
would be $^{12}C_2$ such pairs), and use each pair along with 
Equation 11 to evaluate $^{12}C_2$ Sets of values for $m_1$, 
$m_2$ and $c$. The Set of Equations that gives us the Best Values of 
$m_1$, $m_2$ and $c$ is to be considered for final reporting of the 
Multiple Linear Regression Line. That is, this Set of values gives us 
the Smallest Value of the Error, i.e., $\bar{\varepsilon}$, for the Multiple Linear 
Regression Line 

$\bar{y}_1 = m_1\bar{x}_1 + m_2\bar{x}_2 + c + \bar{\varepsilon}$    Equation 24 

In a similar fashion, we can generalize this analysis for more 


than 2 Independent Variables. 
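
The selection procedure for two Independent Variables can be sketched as follows. Several things here are assumptions rather than part of the text: the use of NumPy, the helper names `moment_row` and `best_pair_fit`, and above all the error measure. Because the averaged basic equation is always part of the system, the mean residual is zero for every candidate, so this sketch ranks candidates by the sum of squared residuals instead. Pairs that yield a rank-deficient system (for instance two identical equations) are skipped.

```python
from itertools import combinations, product
import numpy as np

def moment_row(w, X, y):
    """Average of (multiplier * basic relation) over the data points."""
    row = np.array([(w * X[:, 0]).mean(), (w * X[:, 1]).mean(), w.mean()])
    return row, (w * y).mean()

def best_pair_fit(X, y):
    cols = {"y1": y, "x1": X[:, 0], "x2": X[:, 1]}
    names = list(cols)
    # Single multipliers (Equations 12-14), then products (Equations 15-23).
    multipliers = [cols[a] for a in names]
    multipliers += [cols[a] * cols[b] for a, b in product(names, repeat=2)]
    base_row, base_rhs = moment_row(np.ones_like(y), X, y)  # Equation 11
    rows = [moment_row(w, X, y) for w in multipliers]
    best = None
    for i, j in combinations(range(len(rows)), 2):
        A = np.vstack([base_row, rows[i][0], rows[j][0]])
        b = np.array([base_rhs, rows[i][1], rows[j][1]])
        if np.linalg.matrix_rank(A) < 3:
            continue  # repeated or dependent equations cannot be solved
        coef = np.linalg.solve(A, b)            # (m1, m2, c)
        resid = y - (X @ coef[:2] + coef[2])
        sse = float(resid @ resid)              # assumed error measure
        if best is None or sse < best[0]:
            best = (sse, coef, (i, j))
    return best

# Illustrative data lying exactly on y1 = 2*x1 - 3*x2 + 1.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
y = 2.0 * X[:, 0] - 3.0 * X[:, 1] + 1.0
sse, coef, pair = best_pair_fit(X, y)
print(coef, pair)
```

A different error measure could be substituted on the line marked as an assumption without changing the rest of the search.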
3 OPTIMAL MULTIPLE NON LINEAR 
REGRESSION 


The theory of Optimal Multiple Non-Linear Regression Analysis 
can be performed similarly to the Optimal Linear Regression 
Model by applying some changed Restrictions: 


Restrictions: 


1. We may not need to use all the Variables as such Scalar 
Variable Multipliers at least once. 

2. The highest degree of any of the terms on both sides of 
equation gotten by such aforementioned Scalar Variable 
Multipliers is equal to the sum of the number of 
Independent Variables and the number of Dependent 
Variables. 

3. The Independent Variables are the Variables themselves 
and their Squares, Cubes, etc., up to (raised to) the power 
$f$. That is, with $g$ Variables there are $fg$ Independent Variables, detailed below: 

$X_1 = x_1,\ X_2 = x_2,\ X_3 = x_3,\ \ldots,\ X_g = x_g$ 
$X_{(g+1)} = x_1^2,\ X_{(g+2)} = x_2^2,\ X_{(g+3)} = x_3^2,\ \ldots,\ X_{(2g)} = x_g^2$ 
$X_{(2g+1)} = x_1^3,\ X_{(2g+2)} = x_2^3,\ X_{(2g+3)} = x_3^3,\ \ldots,\ X_{(3g)} = x_g^3$ 
$\ldots$ 
$X_{((f-1)g+1)} = x_1^f,\ X_{((f-1)g+2)} = x_2^f,\ X_{((f-1)g+3)} = x_3^f,\ \ldots,\ X_{(fg)} = x_g^f$ 

For Example, 

Say if we have 

$\bar{y}_1 = m_{11}\bar{x}_1 + m_{12}\overline{x_1^2} + m_{21}\bar{x}_2 + m_{22}\overline{x_2^2} + c$    Equation 25 

Here, the Degree of the above Equation is 2, and the total 
number of Variables is 5, namely, $y_1, x_1, x_1^2, x_2, x_2^2$. Since the 
Non-Linearity is observed as a relationship of the Dependent 
Variable with the Independent Variables, while the Independent 
Variables are Integral Powers of the Independent Variables 
themselves, we consider the Degree of Equation 25 to be 2. 
Hence, we can have the resulting equations after multiplying 
with Scalar Variables and applying the Summation Operator 
(on the n data points) in the following fashion, 
wherein the net degree of any of the Equations must not exceed 
5. 
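From the Basic Equation 25 we generate 5 more Equations as 
follows. 
Multiplying Equation 25 by $y_1$ gives us 
$\overline{y_1^2} = m_{11}\overline{y_1 x_1} + m_{12}\overline{y_1 x_1^2} + m_{21}\overline{y_1 x_2} + m_{22}\overline{y_1 x_2^2} + c\,\bar{y}_1$    Equation 26 
Similarly, 
multiplying Equation 25 by $x_1$ gives us 
$\overline{x_1 y_1} = m_{11}\overline{x_1^2} + m_{12}\overline{x_1^3} + m_{21}\overline{x_1 x_2} + m_{22}\overline{x_1 x_2^2} + c\,\bar{x}_1$    Equation 27 
Multiplying Equation 25 by $x_2$ gives us 
$\overline{x_2 y_1} = m_{11}\overline{x_1 x_2} + m_{12}\overline{x_1^2 x_2} + m_{21}\overline{x_2^2} + m_{22}\overline{x_2^3} + c\,\bar{x}_2$    Equation 28 
Multiplying Equation 25 by $x_1^2$ gives us 
$\overline{x_1^2 y_1} = m_{11}\overline{x_1^3} + m_{12}\overline{x_1^4} + m_{21}\overline{x_1^2 x_2} + m_{22}\overline{x_1^2 x_2^2} + c\,\overline{x_1^2}$    Equation 29 
Multiplying Equation 25 by $x_2^2$ gives us 
$\overline{x_2^2 y_1} = m_{11}\overline{x_1 x_2^2} + m_{12}\overline{x_1^2 x_2^2} + m_{21}\overline{x_2^3} + m_{22}\overline{x_2^4} + c\,\overline{x_2^2}$    Equation 30 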


We can note that the Highest Degree among the Set of 
Equations derived is still 4 and we can go up to 5. 

At this stage also, we use the best 5 Equations among the 
1+5=6 Equations and solve for 

$m_{11}, m_{12}, m_{21}, m_{22}, c$ 

such that the Error is Minimum compared to any other such 
combination. 
Also, noting that we can Scalar Multiply the Equations 29 and 
30 (of Degree 4) each with $y_1, x_1, x_2$, 
giving 2x3=6 more Equations, and also noting that we can 
Scalar Multiply the Equations 26, 27 and 28 (of Degree 3) each 
with $x_1^2, x_2^2$, giving 3x2=6 more Equations, we now have a total 
of 

1 → Equation 25 

5 → Equations 26, 27, 28, 29, 30 

6 → Equations gotten by multiplying Equations 29 and 
30 of Degree 4, each with $y_1, x_1, x_2$ 

6 → Equations gotten by multiplying Equations 26, 27 
and 28 of Degree 3, each with $x_1^2, x_2^2$ 

which are a total of 18 Equations. 


Since we have 5 unknowns and Equation 25 is the Basic 
Equation that generates the others, we have to form all possible 
groups of 4 Equations from the remaining 17 Equations, 
excepting Equation 25. These will be $^{17}C_4$ in number. Using 
each of these groups of 4 Equations along with Equation 25, we 
find the Set of values of $m_{11}, m_{12}, m_{21}, m_{22}, c$ for each group. We 
then report that particular Set of $m_{11}, m_{12}, m_{21}, m_{22}, c$ values 
which best Minimizes the Error. 
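
A sketch of this search for the quadratic example, under stated assumptions: NumPy is used, the function name `best_group_fit` is hypothetical, candidates are ranked by the sum of squared residuals (the text does not fix an error measure), and rank-deficient groups are skipped. The 17 multipliers are built exactly as listed above, so repeated products are kept and simply produce groups that get skipped.

```python
from itertools import combinations
from math import comb
import numpy as np

def best_group_fit(F, y, multipliers):
    """F: (n, k) regressor matrix; unknowns are k coefficients plus c."""
    n, k = F.shape
    G = np.column_stack([F, np.ones(n)])           # last column carries c
    base = (G.mean(axis=0), y.mean())              # averaged Basic Equation
    rows = [((w[:, None] * G).mean(axis=0), (w * y).mean())
            for w in multipliers]
    best = None
    for group in combinations(range(len(rows)), k):
        A = np.vstack([base[0]] + [rows[g][0] for g in group])
        b = np.array([base[1]] + [rows[g][1] for g in group])
        if np.linalg.matrix_rank(A) < k + 1:
            continue                                # unsolvable group
        coef = np.linalg.solve(A, b)
        resid = y - G @ coef
        sse = float(resid @ resid)                  # assumed error measure
        if best is None or sse < best[0]:
            best = (sse, coef, group)
    return best

# Illustrative data that is exactly quadratic in x1 and x2.
rng = np.random.default_rng(1)
x1, x2 = rng.normal(size=(2, 40))
y = 1.0 + 2.0 * x1 - 1.0 * x1**2 + 0.5 * x2 + 3.0 * x2**2
F = np.column_stack([x1, x1**2, x2, x2**2])        # m11, m12, m21, m22

first = [y, x1, x2, x1**2, x2**2]                  # Equations 26 to 30
from_29_30 = [p * q for p in (x1**2, x2**2) for q in (y, x1, x2)]
from_26_28 = [p * q for p in (y, x1, x2) for q in (x1**2, x2**2)]
multipliers = first + from_29_30 + from_26_28      # 17 equations

print(len(multipliers), comb(17, 4))               # groups to examine
sse, coef, group = best_group_fit(F, y, multipliers)
print(coef)
```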
Therefore, it is to be noted that Clever Algebraic Degree 
Balancing and Scalar Multiplier Variable selection to Multiply 
the Equations generated with, is necessary to sustain the 
Highest Possible Allowed Degree in the Equations generated 
from the Basic Equation to solve for Optimal Non-Linear 
Regression Coefficients. 
4 GENERALIZED MODEL FOR OPTIMAL 
MULTIPLE LINEAR REGRESSION 


When we have r number of Independent Variables 
$x_1, x_2, x_3, \ldots, x_{r-1}, x_r$ and 1 Dependent Variable $y_1$, each 
belonging to $\mathbb{R}$, a linear relationship between $x_{ji}$ (for $i = 1$ to $r$) 
and $y_j$ (for $j = 1$ to $n$) can be written as 

$y_j = \sum_{i=1}^{r} m_i x_{ji} + c$    Equation 31 

where we have ignored the Error $\varepsilon_j$ for our Analysis. 

Since the number of Independent Variables is r and the 
number of Dependent Variables is 1, the Highest Degree of the 
Equations Generated by multiplying Equation 31 with the Scalar 
Variable Multipliers, them being $x_1, x_2, x_3, \ldots, x_{r-1}, x_r$, is $(r+1)$. 
We now multiply Equation 31 throughout by $x_{js}$, where 
$s = 1$ to $r$, giving us 

$y_j x_{js} = \sum_{i=1}^{r} m_i x_{ji} x_{js} + c\,x_{js}$    Equation 32 

Applying the Summation Operator on Equation 32 throughout 
for $j = 1$ to $n$, we get 

$\sum_{j=1}^{n} y_j x_{js} = \sum_{j=1}^{n}\sum_{i=1}^{r} m_i x_{ji} x_{js} + c\sum_{j=1}^{n} x_{js}$    Equation 33a 

Dividing the entire Equation 33a by $n$, we get 

$\frac{1}{n}\sum_{j=1}^{n} y_j x_{js} = \frac{1}{n}\sum_{j=1}^{n}\sum_{i=1}^{r} m_i x_{ji} x_{js} + \frac{c}{n}\sum_{j=1}^{n} x_{js}$    Equation 33b 

which can be written as 

$\overline{y x_s} = \sum_{i=1}^{r} m_i\,\overline{x_i x_s} + c\,\bar{x}_s$    Equation 34 

which are s number of Equations, them being Equations 
(34+1) through Equations (34+s). 
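
As a sketch of how these equations could be used, the block below solves the averaged form of Equation 31 (the mean of y equals the sum of the $m_i$ times the means of the $x_i$, plus c) together with one equation of the Equation 34 type for each multiplier $x_s$. NumPy and the helper name `fit_with_x_multipliers` are assumptions, and the data is synthetic.

```python
import numpy as np

def fit_with_x_multipliers(X, y):
    """Solve for (m_1..m_r, c) using the multipliers 1, x_1, ..., x_r."""
    n, r = X.shape
    G = np.column_stack([X, np.ones(n)])       # regressors plus constant
    A = np.empty((r + 1, r + 1))
    b = np.empty(r + 1)
    A[0] = G.mean(axis=0)                      # averaged Equation 31
    b[0] = y.mean()
    for s in range(r):
        w = X[:, s]                            # multiplier x_s
        A[s + 1] = (w[:, None] * G).mean(axis=0)   # Equation 34 for this s
        b[s + 1] = (w * y).mean()
    coef = np.linalg.solve(A, b)
    return coef[:r], coef[r]

# Synthetic data: three Independent Variables with a little noise.
rng = np.random.default_rng(2)
X = rng.normal(size=(30, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + 4.0 + 0.1 * rng.normal(size=30)
m, c = fit_with_x_multipliers(X, y)
print(m, c)
```

This particular choice of multipliers reproduces the normal equations of ordinary least squares, so the result can be cross-checked against `np.linalg.lstsq`.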
Also, we multiply Equation 31 throughout by $y_j$, giving 
$y_j^2 = \sum_{i=1}^{r} m_i x_{ji} y_j + c\,y_j$    Equation (34+s+1) 

Applying the Summation Operator on Equation (34+s+1) 
throughout for $j = 1$ to $n$, we get 

$\sum_{j=1}^{n} y_j^2 = \sum_{j=1}^{n}\sum_{i=1}^{r} m_i x_{ji} y_j + c\sum_{j=1}^{n} y_j$    Equation (34+s+2)a 

Dividing the above Equation throughout by $n$, we get 

$\frac{1}{n}\sum_{j=1}^{n} y_j^2 = \frac{1}{n}\sum_{j=1}^{n}\sum_{i=1}^{r} m_i x_{ji} y_j + \frac{c}{n}\sum_{j=1}^{n} y_j$    Equation (34+s+2)b 

which can be further written as 

$\overline{y^2} = \sum_{i=1}^{r} m_i\,\overline{x_i y} + c\,\bar{y}$    Equation (34+s+3) 

Similarly, from Equation 31, we can generate the following Set 
of Equations 

$y_j x_{js} = \sum_{i=1}^{r} m_i x_{ji} x_{js} + c\,x_{js}$    Equation (34+s+4)a 

Also, applying the Summation Operator throughout the points 
$j = 1$ to $n$ on Equation (34+s+4)a, we get 

$\sum_{j=1}^{n} y_j x_{js} = \sum_{j=1}^{n}\sum_{i=1}^{r} m_i x_{ji} x_{js} + c\sum_{j=1}^{n} x_{js}$    Equation (34+s+4)b 

which can be further written as 

$\overline{y x_s} = \sum_{i=1}^{r} m_i\,\overline{x_i x_s} + c\,\bar{x}_s$    Equation (34+s+4)c 

Also, applying the Summation Operator through the points 
$j = 1$ to $n$ on Equation 31 gives 

$\sum_{j=1}^{n} y_j = \sum_{j=1}^{n}\sum_{i=1}^{r} m_i x_{ji} + nc$    Equation (34+s+5)a 

Now dividing the above Equation (34+s+5)a by $n$, we get 

$\bar{y} = \sum_{i=1}^{r} m_i \bar{x}_i + c$    Equation (34+s+5)b 
We now consider multiplying Equation (34+s+1) throughout by 
$y_j^{(r-1)}$ and get the following 
$y_j^{(r+1)} = \sum_{i=1}^{r} m_i x_{ji} y_j^{r} + c\,y_j^{r}$    Equation (34+s+5)c 


Actually, we can note that Equation (34+s+5), whose highest 
degree is 1, can be rendered into an Equation with Highest 
Degree (r + 1) in the following fashion: 

That is, multiplying the following Equation (34+s+5)b 

$\bar{y} = \sum_{i=1}^{r} m_i \bar{x}_i + c$ throughout by $y^{\beta} x_i^{(r-\beta)}$, where $\beta$ takes values from 1 to $r$. 

That is, there are $r^2$ number of ways which will give $r^2$ number 
of Equations as $i$ goes from 1 to $r$. 

Let these be Equations (34+s+7) through Equations (34+s+7+$r^2$). 

Though we can note that the factors $y^{\beta} x_i^{(r-\beta)}$ also give many 
Equations, we will get the Error Minimization while Evaluating 
the Regression Coefficients $m_i$'s and $c$ only when we choose the r 
best equations among the aforementioned $r^2$ 
Equations and use them with Equation (34+s+5) to solve for the r 
number of $m_i$ coefficients and the constant $c$ which gives the 
Minimum Error $\bar{\varepsilon}$. The number of ways in which r 
equations can be selected from $r^2$ equations is $^{r^2}C_r$. 
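
To give a feel for how quickly this number of selections grows, the following sketch evaluates it for a few values of r. The function name `count_selections` is hypothetical; the only assumption about the text is that r equations are chosen out of r squared candidates.

```python
from math import comb

def count_selections(r):
    """Return (number of candidate equations, number of ways to pick r)."""
    candidates = r * r
    return candidates, comb(candidates, r)

for r in (2, 3, 4, 5):
    print(r, count_selections(r))
```

Each selection requires solving one (r+1) by (r+1) linear system, so the exhaustive search is practical only for small r.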
Also, we can note that Equation (34+s+5), whose highest degree 
is 1, can be rendered into an Equation with Highest Degree (r+1) 
in the following fashion as well: 

That is, by multiplying Equation (34+s+5) by a factor 

$y^{\beta_h} \prod_{i \in \{\gamma_h\}} x_i^{\alpha_i}$ where $\beta_h$ can take values from 1 to $r$, where 

$\sum_{i \in \{\gamma_h\}} \alpha_i = (r - \beta_h)$    Equation (34+s+5)d 

We can note that the Cardinality of $\{\gamma_h\}$ belongs to the Set 
detailed as below: 
$\Gamma = \{1, 2, 3, \ldots, (r-1), r\}$ and $\{\gamma_h\} \subset \Gamma$ with $i, \beta_h, \alpha_i \in \Gamma$, i.e., 
$i, \beta_h, \alpha_i$ are Single Element Sub-Sets of $\Gamma$. Also, it is to be 
noted that we have to consider the aforementioned Multiplying 
Factor for all possible subsets $\gamma_h$ of $\Gamma$. 

This also gives us some number of additional Equations, 
among which (including the already generated Equations) we 
select the Best r Equations to use along with 
Equation (34+s+5) to solve for the $m_i$'s and $c$. For achieving 
this, we have to also find all possible sets of r 
Equations from the Complete Set of aforementioned generated 
Equations, excepting Equation (34+s+5). Such a best Set of r 
Equations, when used along with Equation (34+s+5) 
to solve for the $m_i$'s and $c$, minimizes the Error $\bar{\varepsilon}$. 
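
One way to enumerate such multiplying factors is sketched below. It assumes each factor is a power of y (from 1 to r) times a product of powers of the Independent Variables, with the x exponents summing to r minus the power of y, so that every resulting equation has degree r + 1. A zero exponent is used here to mean that the variable is left out of the product; that encoding and the name `multiplier_exponents` are choices made for this illustration only.

```python
from itertools import product

def multiplier_exponents(r):
    """Return a list of (beta, alphas) describing y**beta * prod x_i**alphas[i].

    alphas[i] == 0 means x_(i+1) does not appear in the factor.
    """
    out = []
    for beta in range(1, r + 1):
        target = r - beta                      # total power left for the x's
        for alphas in product(range(target + 1), repeat=r):
            if sum(alphas) == target:
                out.append((beta, alphas))
    return out

factors = multiplier_exponents(3)
for beta, alphas in factors:
    print(beta, alphas)
```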
5 GENERALIZED MODEL FOR OPTIMAL 
MULTIPLE NON LINEAR REGRESSION 


When we have $r$ number of Independent Variables and each of 
them is used as a Polynomial of Degree $\alpha_i$, where $i$ goes from 
$i = 1$ to $r$, then the Non-Linear Relationship between the 
Independent and Dependent Variables can be written as follows: 
$y_1 = \sum_{i=1}^{r}\sum_{j=1}^{\alpha_i} m_{ij}\,x_i^{j} + c$    Equation (34+s+6) 

where $\alpha_i \in N$, i.e., the Set of Positive Integers starting from 1. 
That is, $\alpha_i$ is a function from the Set $\{i\}$ to $\{N\}$, 
for small N. We can note that the Degree of Equation (34+s+6) 
is $\max(j)$ over $i = 1$ to $r$ and $j = 1$ to $\alpha_i$. And the allowable Degree 
to which we can Modify Equation (34+s+6) by applying the Scalar 
Variable Multipliers $y_1$ and $x_i^{j}$ is $\{1 + \sum_{h=1}^{r} \alpha_h\}$, which is the 
total number of Variables, both 
Independent and Dependent. This is also the number of Scalar 
Variable Multipliers to begin with to apply on Equation 
(34+s+6). These $\{1 + \sum_{h=1}^{r} \alpha_h\}$ Scalar Variable 
Multipliers, when acted on Equation (34+s+6), give us 
$\{1 + \sum_{h=1}^{r} \alpha_h\}$ Equations in addition to Equation 
(34+s+6). Though the situation is one wherein the number of 
unknowns is less than the number of Equations available to 
solve for the Regression Coefficients $m_{ij}$'s and $c$, we should note 
that this would not give us an Optimal Value of the Regression 
Coefficients that Minimizes the Error $\bar{\varepsilon}$. 


Therefore, we consider each of the $\{1 + \sum_{h=1}^{r} \alpha_h\}$ 
Equations that are generated from the Basic Equation (34+s+6) 
and multiply each by each of the $\{1 + \sum_{h=1}^{r} \alpha_h\}$ Scalar Variable 
Multipliers to give us now $\{1 + \sum_{h=1}^{r} \alpha_h\}^2$ Equations. 

Now, we again check whether the Degree of these Equations is less 
than the number $\{1 + \sum_{h=1}^{r} \alpha_h\}$ or not. If it is still less, we repeat the 
process of multiplying each of the $\{1 + \sum_{h=1}^{r} \alpha_h\}$ Scalar Variable 
Multipliers with each of the Equations generated from the Basic 
Equation until now, which are $\{1 + \sum_{h=1}^{r} \alpha_h\}^2 + \{1 + \sum_{h=1}^{r} \alpha_h\}$ in number, giving 
us $\{1 + \sum_{h=1}^{r} \alpha_h\} \cdot \left[\{1 + \sum_{h=1}^{r} \alpha_h\}^2 + \{1 + \sum_{h=1}^{r} \alpha_h\}\right]$ 
Equations. We keep repeating this procedure till we get the 
latest Set of generated Equations all having Degree greater than 
$\{1 + \sum_{h=1}^{r} \alpha_h\}$. Once we achieve this, we consider all those 
generated equations whose Degree does not exceed $\{1 + \sum_{h=1}^{r} \alpha_h\}$ 
and find all possible groups of $\{\sum_{i=1}^{r} \alpha_i\}$ generated 
equations from among them. We solve for the Regression Coefficients 
$m_{ij}$'s and $c$ using each such aforementioned group of generated 
equations and equation (34+s+6). We report those values of 
$m_{ij}$'s and $c$ as the Regression Coefficients for which the Error 
$\bar{\varepsilon}$ is Minimum. 
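
The repeated multiplication can be sketched as a small search over multiplier monomials. The sketch rests on several assumptions that go beyond the text: a multiplier is recorded by its exponents on y and on each x, the degree of a generated equation is taken to be the degree of the Basic Equation (the largest power used) plus the degree of the multiplier, and only distinct multipliers are kept, so the counts differ from the tallies with repetition given above. The function name `expand_multipliers` is hypothetical.

```python
def expand_multipliers(alphas):
    """alphas[i] is the highest power used for Independent Variable i+1."""
    r = len(alphas)
    limit = 1 + sum(alphas)           # allowed degree of any equation
    base_degree = max(alphas)         # degree of the Basic Equation
    # Exponent tuples: (power of y, power of x_1, ..., power of x_r).
    base = [(1,) + (0,) * r]
    for i, a in enumerate(alphas):
        for j in range(1, a + 1):
            base.append((0,) + tuple(j if k == i else 0 for k in range(r)))
    # Keep only multipliers whose equation stays within the allowed degree.
    seen = {b for b in base if base_degree + sum(b) <= limit}
    frontier = set(seen)
    while frontier:
        new = set()
        for mono in frontier:
            for b in base:
                prod = tuple(p + q for p, q in zip(mono, b))
                if base_degree + sum(prod) <= limit and prod not in seen:
                    new.add(prod)
        seen |= new
        frontier = new                # stop once nothing new fits the limit
    return sorted(seen)

# Two Independent Variables, each used up to its Square.
multipliers = expand_multipliers((2, 2))
print(len(multipliers))
```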
We can also note that, in addition, we can construct Scalar 
Variable Multipliers in the following fashion to generate 
Equations from the Basic Equation: 

We consider construction of the Scalar Variable Multipliers of 
the form 

$y^{\beta_h} \prod_{v=\{\delta_h\}} \prod_{i=\{\theta_h\}} \prod_{j=\{\alpha_{ij}\}} \left(x_i^{j}\right)^{v}$ where 

$\theta_h \in N$ with small N, 
$i = \{\theta_h\} \subset \{1, 2, 3, \ldots, (r-1), r\}$, 
$j = \{\alpha_{ij}\} \subset N$ with small N, 
$v = \{\delta_h\} \subset N$ with small N, 

and $R = \sum_{h=1}^{r} \alpha_h$ 

such that $\sum_{v}\sum_{i}\sum_{j} (j \cdot v) + \beta_h + \max_{i=1\ \mathrm{to}\ r} \alpha_i = \{1 + \sum_{h=1}^{r} \alpha_h\}$    Equation (34+s+7) 

and $\Gamma = \{1, 2, 3, \ldots, (R-1), R\}$ and $\beta_h, \alpha_i \in \Gamma$, that is, $\beta_h, \alpha_i$ are 
Single Element Sub-Sets of $\Gamma$. Also, it is to be noted that we have 
to consider evaluation of the aforementioned Multiplying 
Factors for all possible Sub-Sets $\{\theta_h\}$, $\{\alpha_{ij}\}$ and $\{\delta_h\}$. This also 
gives us some number of additional Equations among which 
(including the already generated Equations) we select the Best 
R Equations to use along with Equation (34+s+6) to solve for 
the $m_{ij}$'s and $c$. For achieving this, we have to also find all 
possible Sets of R Equations from the Complete 
Set of aforementioned generated Equations, excepting Equation 
(34+s+6). Such a best Set of R Equations, when used along with 
Equation (34+s+6) to solve for the $m_{ij}$'s and $c$, Minimizes the Error 
$\bar{\varepsilon}$. (Needless to mention, we have to use all R values in order to 
achieve Optimization.) Furthermore, and finally, we can note 
that Equation (34+s+6), whose highest Degree is 1, can be 
rendered into an Equation with Highest Degree (R+1) in the 
following fashion: 
That is, by multiplying Equation (34+s+6) throughout by a 
factor $y^{\beta_h} \prod_{v=\{\delta_h\}} \prod_{i=\{\theta_h\}} \prod_{j=\{\alpha_{ij}\}} \left(x_i^{j}\right)^{v}$ only, but satisfying the below given 
constraint: 

$\sum_{v}\sum_{i}\sum_{j} (j \cdot v) + \beta_h + \max_{i=1\ \mathrm{to}\ r} \alpha_i = \{1 + \sum_{h=1}^{r} \alpha_h\} - z$    Equation (34+s+8) 

with $z$ taking all possible Positive Integer Values for each $h$. 
We again now find the Optimal Regression Coefficients $m_{ij}$'s 
and $c$ that best Minimize the Error $\bar{\varepsilon}$. 
6 ERROR ANALYSIS 


The same Error Metrics such as in the conventional Multiple 
Linear Regression Analysis and Multiple Non-Linear Regression 
Analysis can be used for the Optimal cases as well. 
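
The text does not name specific metrics. As an illustrative sketch only, the block below computes a few that are commonly used with conventional regression (mean error, mean absolute error, root mean squared error and the coefficient of determination). NumPy, the function name `error_metrics` and the sample values are assumptions.

```python
import numpy as np

def error_metrics(y_true, y_pred):
    """Common residual-based error metrics for a fitted regression."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    resid = y_true - y_pred
    ss_res = float(resid @ resid)                         # residual sum of squares
    ss_tot = float(((y_true - y_true.mean()) ** 2).sum())  # total sum of squares
    return {
        "mean_error": float(resid.mean()),
        "mae": float(np.abs(resid).mean()),
        "rmse": float(np.sqrt(ss_res / resid.size)),
        "r2": 1.0 - ss_res / ss_tot,
    }

metrics = error_metrics([1.0, 2.0, 3.0], [1.0, 2.0, 4.0])
print(metrics)
```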


Exhaustive Modeling Of Error And According Updation Of The 
Model 


The Errors $\varepsilon_j$ for all $(x_j, y_j)$, which are n in number, for 
the afore-described Model are themselves modeled using the same 
Model, using the following transformations for the 
updating of the Model to include or explain the Error: 


Case 1: Optimal Multiple Linear Regression 


$m_{pq} = \omega_q m_{ij}$ with $p \ne i$, $q \ne j$ ($i, p = 1$ to $n$)    Equation (34+s+9) 
which consequently leads to the transformation relation 

$(m_{pq} = \omega_q m_{ij}) \Rightarrow (m_{pq} = \omega_q (m_{ij} + \kappa))$    Equation (34+s+10) 

and also, $c = \omega_c m_{ij}$    Equation (34+s+10) 
leading to the transformation relation 

$(c = \omega_c m_{ij}) \Rightarrow (c = \omega_c (m_{ij} + \kappa))$    Equation (34+s+11) 


Case 2: Optimal Multiple Non Linear Regression 


N 


Wie >to, yr m,| with D#1l, q# (i. p=l ton) Equation 
“ie 
(34+s+12) 


which consequently leads to the transformation relation 


fre = Solos} mI» me = Shou? +f nation 


(34+s+13) 
N 


ae aa . . 
and also, ©= > {o,) m,| leading to the transformation 
H=l 


relation fe = oy {o, Ym, ae fe => fo, }-(m,+ a 


i=l i=l 


Equation (34+s+14) 


where N is a Set Of Positive Integers with small N. 


We keep repeating this procedure till we 
achieve Error Convergence after some such Steps of Updating the 
Model for the case of the Optimal Multiple Linear Regression Model, 
and till we achieve Zero Error after some Steps of Updating the 
Model for the case of the Optimal Multiple Non Linear Regression Model. 